Noise vs computational intractability in dynamics 



Mark Braverman Alexander Grigo 

Computer Science Department Mathematics Department 
Princeton University University of Toronto 

Cristobal Rojas 
Departamento de Matematicas 
Universidad Andres Bello * 

January 4, 2012 



Abstract 

Computation plays a key role in predicting and analyzing natural 
phenomena. There are two fundamental barriers to our ability to com- 
putationally understand the long-term behavior of a dynamical system 
that describes a natural process. The first one is unaccounted-for er- 
rors, which may make the system unpredictable beyond a very limited 
time horizon. This is especially true for chaotic systems, where a small 
change in the initial conditions may cause a dramatic shift in the trajec- 
tories. The second one is Turing-completeness. By the undecidability 
of the Halting Problem, the long-term prospects of a system that can 
simulate a Turing Machine cannot be determined computationally. 

We investigate the interplay between these two forces - unaccounted- 
for errors and Turing-completeness. We show that the introduction of 
even a small amount of noise into a dynamical system is sufficient to 
"destroy" Turing-completeness, and to make the system's long-term 
behavior computationally predictable. On a more technical level, we 
deal with long-term statistical properties of dynamical systems, as de- 
scribed by invariant measures. We show that while there are simple dy- 
namical systems for which the invariant measures are non-computable, 
perturbing such systems makes the invariant measures efficiently com- 
putable. Thus, noise that makes the short term behavior of the system 
harder to predict, may make its long term statistical behavior computa- 
tionally tractable. We also obtain some insight into the computational 
complexity of predicting systems affected by random noise. 



*MB is supported by an NSERC Discovery Grant, CR is supported by a FONDECYT 
Grant. 

1 Introduction

1.1 Motivation and statement of the results 

In this paper we investigate (non)-computability phenomena surrounding 
physical systems. The Church-Turing thesis asserts that any computation
that can be carried out in finite time by a physical device can be carried
out by a Turing Machine. The thesis can be paraphrased in the following
way: provided all the initial conditions with arbitrarily good precision, and
random bits when necessary, the Turing Machine can simulate the physical
system S over any fixed period of time [0, T] for $T < \infty$.

In reality, however, we are often interested in more than just simulating
the system for a fixed period of time. In many situations, one would like
to understand the long-term behavior of S as $T \to \infty$. Some of
the important properties that fall into this category include:

1. Reachability problems: given an initial state $x_0$, does the system S
ever enter a state x or a set of states X?

2. Asymptotic topological properties: given an initial state $x_0$, which
regions of the state space are visited infinitely often by the system?

3. Asymptotic statistical properties: given an initial state $x_0$, does the
system converge to a "steady state" distribution, and can this distribution
be computed? Does the distribution depend on the initial state $x_0$?

The first type of question is studied in Control Theory [BP07] and
also in Automated Verification [CGP99]. The third type of question is
commonly addressed by Ergodic Theory [Wal82, Pet83]. These questions, in
a variety of contexts, are also studied by the mathematical field of Dynamical
Systems [Man87]. For example, one of the celebrated achievements of the
Kolmogorov-Arnold-Moser (KAM) theory and its extensions [Mos01] is in
providing the understanding of question (1) above for systems of planets
such as the solar system.

An important challenge one needs to address in formally analyzing the 
computational questions surrounding dynamical systems is the fact that 
some of the variables involved, such as the underlying states of S, may
be continuous rather than discrete. These are very important formalities,
which can be addressed e.g. within the framework of computable analysis
[Wei00]. Other works dealing with "continuous" models of computation
include [Ko91, PER89, BCSS98]. Most results, both positive and negative,
that are significant in practice, usually hold true for any reasonable model 
of continuous computation. 

Numerous results on computational properties of dynamical systems
have been obtained. In general, while bounded-time simulations are usually
possible, the computational outlook for the "infinite" time horizon problems
is grim: the long-term behavior features of many of the interesting systems
are non-computable. Notable examples include piecewise linear maps
[Moo90, AMP95], polynomial maps on the complex plane [BY06, BY07] and
cellular automata [Wol02, KL09]. The proofs of these negative results, while
sometimes technically involved, usually follow the same outline: (1) show
that the system S is "rich enough" to simulate any Turing Machine M; (2)
show that solving the Halting Problem (or some other non-recursive problem)
on M can be reduced to computing the feature $\mathcal{F}$ in question. These
proofs can be summarized in the following:

Thesis 1. If the physical system is rich enough, it can simulate universal 
computation and therefore many of the system's long-term features are non- 
computable. 

This means that while analytic methods can prove some long-term prop- 
erties of some dynamical systems, for "rich enough" systems, one cannot 
hope to have a general closed-form analytic algorithm, i.e. one that is not 
based on simulations, that computes the properties of its long-term behav- 
ior. This fundamental phenomenon is qualitatively different from chaotic 
behavior, or the "butterfly effect", which is often cited as the reason that 
predicting complex dynamical systems is hard beyond a very short time 
horizon; e.g. the weather being hard to predict a few days in advance. 

Chaotic behavior means that the system is extremely sensitive to the
initial conditions, thus someone with only approximate knowledge of the
initial state can predict the system's state only within a relatively short
time horizon. This does not at all preclude one from being able to compute
practically relevant statistical properties about the system. Returning to
the weather example, the forecasters may be unable to tell us whether it
will rain this Wednesday, but they can give a fairly accurate distribution of
temperatures on September 1st next year!

On the other hand, the situation with systems as in Thesis 1 is much 
worse. If the system is rich enough to simulate a Turing Machine it will 
exhibit "Turing Chaos": even its statistical properties will become non- 
computable, not due to precision problems with the initial conditions but 
due to the inherent computational hardness of the system. This even led
some researchers to suggest [Wol02] that simulation is the only way to
analyze the dynamical systems that are rich enough to simulate a universal
Turing Machine.

Our goal is to better understand under which scenarios computability- 
theoretic barriers, rather than incomplete understanding of the system or 
its initial condition, preclude us from analyzing the system's long term be- 
havior. A notable feature, shared by several prior works on computational 
intractability in dynamical systems, such as [Moo90, BY06, AB01], is that 
the non-computability phenomenon is not robust: the non-computability 
disappears once one introduces even a small amount of noise into the system.
Thus, if one believes that natural systems are inherently noisy, one
would not be able to observe such non-computability phenomena in nature.
In fact, we conjecture:

Conjecture 2. In finite-dimensional systems non-computable phenomena
are not robust.

Thus, we conjecture that noise actually makes long-term features of the 
system easier to predict. A notable example of a robust physical system 
that is Turing complete is the RAM computer. Note, however, that to 
implement a Turing Machine on a RAM machine one would need a machine 
with unlimited storage, thus such a computer, while feasible if we assume 
unlimited physical space, would be an infinite-dimensional system. We do 
not know of a way to implement a Turing Machine robustly using a finite- 
dimensional dynamical system. 

In this paper we will focus on discrete-time dynamical systems over
continuous spaces as a model for physical processes. Namely, there is a set
X representing all the possible states the system S can ever be in, and a
function $f : X \to X$ representing the evolution of the system in one unit of
time. In other words, if at time 0 the system is in state x, then at time t it
will be in state $f^t(x) = (f \circ f \circ \cdots \circ f)(x)$ (t times).
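As a small illustration of the model just described, the following sketch iterates a map t times. The logistic map used here is a hypothetical example chosen for the sketch; it does not appear in the paper.

```python
from typing import Callable


def iterate(f: Callable[[float], float], x: float, t: int) -> float:
    """Return f^t(x), the state reached from x after t applications of f."""
    for _ in range(t):
        x = f(x)
    return x


def logistic(x: float) -> float:
    # Hypothetical example map on X = [0, 1]; an assumption of this sketch.
    return 4.0 * x * (1.0 - x)


# Two nearby initial states; comparing a and b shows how far they have
# drifted apart after 30 steps of the same deterministic rule.
a = iterate(logistic, 0.2, 30)
b = iterate(logistic, 0.2 + 1e-9, 30)
```

Floating-point iteration only approximates the true orbit, which is itself an instance of the "unaccounted-for errors" discussed in the abstract.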

We are interested in computing the asymptotic statistical properties of
S as $t \to \infty$. These properties are described by the invariant measures of
the system - the possible statistical behaviors of $f^t(x)$ once the system has
converged to a "steady state" distribution. While in general there might be
infinitely (even uncountably) many invariant measures, only a small portion
of them are physically relevant.$^1$ A typical picture is the following: the phase
space can be divided into regions exhibiting qualitatively different limiting
behaviors. Within each region $\mathcal{R}_i$, for almost every initial condition $x \in \mathcal{R}_i$,
the distribution of $f^t(x)$ will converge to a "steady state" distribution $\mu_i$ on
X, supported on the region. We are interested in whether these distributions
can be computed:

Problem 3. Assume that the system S has reached some stationary equilibrium
distribution $\mu$. What is the probability $\mu(A)$ of observing a certain
event A?

In some sense this is the most basic question one can ask about the long-term
behavior of the system S. Formally, the above question corresponds
to the computability of the ergodic invariant measures of the system$^2$ (see
Section 2). A negative answer to Problem 3 was given in [GHR11] where the
authors demonstrate the existence of computable one-dimensional systems
for which every invariant measure is non-computable. This is consistent
with Thesis 1 above.

$^1$ The problem of characterizing these physical measures is an important challenge in
Ergodic Theory.

In the present paper we study Problem 3 in the presence of small random
perturbations: each iteration f of the system S is affected by (small) random
noise. Informally, in the perturbed system $S_\varepsilon$ the state of the system jumps
from x to f(x) and then disperses randomly around f(x) with distribution
$p^\varepsilon_{f(x)}(\cdot)$. The parameter $\varepsilon$ controls the "magnitude" of the noise, so that

$p^\varepsilon_{f(x)}(\cdot) \to f(x)$ as $\varepsilon \to 0$.

Our first result demonstrates that the non-computability phenomena are 
broken by the noise. More precisely, we show: 

Theorem A. Let S be a computable system over a compact subset M of $\mathbb{R}^d$.
Assume $p^\varepsilon_{f(x)}(\cdot)$ is uniform on the $\varepsilon$-ball around f(x). Then, for almost every
$\varepsilon > 0$, the ergodic measures of the perturbed system $S_\varepsilon$ are all computable.

The precise definition of computability of measures is given in Section 
2. The assumption of uniformity on the noise is not essential, and it can 
be relaxed to (computable) absolute continuity. Theorem A follows from 
general considerations on the computability and compactness of the relevant 
spaces. It shows that the non-computability of invariant measures is not 
robust, which is consistent with the general Conjecture 2. 

In addition to establishing the result on the computability of invariant 
measures in noisy systems, we obtain upper bounds on the complexity of 
computing these measures. In studying the complexity of computing the
invariant measures, we restrict ourselves to the case when the system has a
unique invariant measure - such systems are said to be "uniquely ergodic".

Theorem B. Suppose the perturbed system $S_\varepsilon$ is uniquely ergodic and the
function f is polynomial-time computable. Then there exists an algorithm
A that computes $\mu$ with precision $\alpha$ in time $O_{S,\varepsilon}(\mathrm{poly}(\frac{1}{\alpha}))$.

Note that the upper bound is exponential in the number of precision bits
we are trying to achieve. The algorithm in Theorem B can be implemented
in a space-efficient way, using only $\mathrm{poly}(\log(1/\alpha))$ amount of space. If the
noise operator has a nice analytical description, and under a mild additional
assumption on f, the complexity can be improved when computing at
precision below the level of the noise. For example, one could take $p^\varepsilon_{f(x)}(\cdot)$ to
be a Gaussian around f(x). This kind of perturbation forces the system
to have a unique invariant measure, while the analytical description of the
Gaussian noise can be exploited to perform a more efficient computation.
We need an extra assumption that in addition to being able to compute f
in polynomial time, we can also integrate its convolution with polynomial
functions in polynomial time.

$^2$ An ergodic measure is an invariant measure that cannot be decomposed into simpler
invariant measures.

Theorem C. Suppose the noise $p^\varepsilon_{f(x)}(\cdot)$ is Gaussian, and f is polynomial-time
integrable in the above sense. Then the computation of $\mu$ at precision
$\delta < O(\varepsilon)$ requires time $O_{S,\varepsilon}(\mathrm{poly}(\log\frac{1}{\delta}))$.

As with Theorem A, we do not really need the noise to be Gaussian: 
any noise function with a uniformly analytic description would suffice. For 
the sake of simplicity, we will prove Theorem C only in the one dimensional 
case. The result can be easily extended to the multi-dimensional case. 

Informally, Theorem C says that the behavior of the system at scales
below the noise level is governed by the "micro"-analytic structure of the
noise that is efficiently predictable, rather than by the "macro"-dynamic
structure of S that can be computationally intractable to predict. Theorem C
suggests that a quantitative version of Conjecture 2 can be made: if
the noise function behaves "nicely" below some precision level $\varepsilon$, properties
of the system do not only become computable with high probability, but the
computation can be carried out within error $\delta < \varepsilon$ in time $O_\varepsilon(\mathrm{poly}(\log\frac{1}{\delta}))$.
We will discuss this further below.

1.2 Comparison with previous work 

It has been previously observed that the introduction of noise may destroy 
non-computability in several settings [AB01, BY08]. There are two concep- 
tual differences that distinguish our work from previous works. Firstly, we 
consider the statistical - rather than topological - long-term behavior of the 
system. We still want to be able to predict the trajectory of the system in 
the long run, but in a statistical sense. Secondly, we also address the compu- 
tational complexity of predicting these statistical properties. In particular, 
Theorem C states that if the noise itself is not a source of additional compu- 
tational complexity, then the "computationally simple" behavior takes over, 
and the system becomes polynomial-time computable below the noise level. 

1.3 Discussion 

Our quantitative results (Theorems B and C) shed light on what we think 
is a more general phenomenon. A given dynamical system, even if it is 
Turing-complete, loses its "Turing completeness" once noise is introduced.
How much computational power does it retain? To give a lower bound, one
would have to show that even in the presence of noise the system is still
capable of simulating a Turing Machine subject to some restrictions on its
resources (e.g. PSPACE Turing Machines). To give an upper bound, one
would have to give a generic algorithm for the noisy system, such as the
ones given by Theorems B and C. For the systems we consider, informally,
Theorems B and C give (when the system is "nice") a PSPACE($\log 1/\varepsilon$)
upper bound on the complexity of computing the invariant measure. It is
also not hard to see that PSPACE($\log 1/\varepsilon$) can be reduced to the evaluation
of an invariant measure of an $\varepsilon$-noisy system of the type we consider. Thus
the computational power of these systems is PSPACE($\log 1/\varepsilon$).

This raises the general question on the computational power of noisy 
systems. In light of the above discussion, it is reasonable to conjecture that 
the computational power is given by PSPACE(M), where M is the amount
of "memory" the system has. In other words, there are $\sim 2^M$ states that are
robustly distinguishable in the presence of noise. This intuition, however,
is hard to formalize for general systems, and further study is needed before 
such a quantitative assertion can be formulated. 

2 Preliminaries 

2.1 Discrete-time dynamical systems 

We now attempt to give a brief description of some elementary ergodic
theory in discrete-time dynamical systems. For a complete treatment see for
instance [Wal82, Pet83, Man87]. A dynamical system consists of a metric
space X representing all the possible states the system can ever be in,
and a map $f : X \to X$ representing the dynamics. In principle, such a
model is deterministic in the sense that complete knowledge of the state of
the system, say $x \in X$, at some initial time, entirely determines the future
trajectory of the system: x, f(x), f(f(x)), .... Despite this, in many
interesting situations it is impossible to predict any particular feature of
any specific trajectory. This is the consequence of the famous sensitivity
to initial conditions (chaotic behavior) and the impossibility of making
measurements with infinite precision (approximation): two initial conditions
which are very close to each other (so they are indistinguishable for the
physical measurement) may diverge in time, rendering the true evolution
unpredictable.

Instead, one studies the limiting or asymptotic behavior of the system.
A common situation is the following: the phase space can be divided into
regions exhibiting qualitatively different limiting behaviors. Within each
region, all the initial conditions give rise to a trajectory which approaches
an "attractor", on which the limiting dynamics take place (and that can be
quite complicated). Thus, different initial conditions within the same region
may lead in the long term to quite different particular behaviors, but identical in
a qualitative sense. Any probability distribution supported in the region will
also evolve in time, approaching a limiting invariant distribution, supported
in the attractor, and which describes in statistical terms the dynamics of
the equilibrium situation. Formally, a probability measure $\mu$ is invariant
if the probabilities of events do not change in time: $\mu(f^{-1}A) = \mu(A)$. An
invariant measure $\mu$ is ergodic if it cannot be decomposed: $f^{-1}(A) = A$
implies $\mu(A) = 1$ or $\mu(A) = 0$.
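As a worked instance of the invariance condition, consider the doubling map on the circle with Lebesgue measure $\lambda$. This example is an illustration chosen for this note and is not taken from the paper; it checks invariance on half-open intervals, which generate the Borel sets.

```latex
% Illustrative example (an assumption of this note): f(x) = 2x mod 1 on [0,1).
\[
  f^{-1}\bigl([a,b)\bigr)
    = \Bigl[\tfrac{a}{2},\tfrac{b}{2}\Bigr) \cup \Bigl[\tfrac{a+1}{2},\tfrac{b+1}{2}\Bigr),
  \qquad 0 \le a < b \le 1,
\]
\[
  \lambda\bigl(f^{-1}[a,b)\bigr)
    = \tfrac{b-a}{2} + \tfrac{b-a}{2}
    = b - a
    = \lambda\bigl([a,b)\bigr).
\]
% The point mass at the fixed point 0 is also invariant, since f(0) = 0:
\[
  \delta_0\bigl(f^{-1}A\bigr) = \mathbf{1}[\,0 \in f^{-1}A\,] = \mathbf{1}[\,f(0) \in A\,] = \delta_0(A).
\]
```

The same map therefore carries at least two distinct invariant measures, which is the situation of "infinitely many invariant measures, only a few physically relevant" mentioned in the introduction.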

We now describe random perturbations of dynamical systems. A stan- 
dard reference for this material is [Kif88]. 

2.1.1 Random perturbations 

Let f be a dynamical system on a space M on which Lebesgue measure
can be defined (say, a Riemannian manifold). Denote by P(M) the set of
all Borel probability measures over M, with the weak convergence topology.
We consider a family $\{Q_x\}_{x \in M} \in P(M)$. By a random perturbation of f
we will mean a Markov Chain $X_t$, $t = 0, 1, 2, \dots$ with transition probabilities
$P(A|x) = P\{X_{t+1} \in A : X_t = x\} = Q_{f(x)}(A)$ defined for any $x \in M$, Borel
set $A \subset M$ and $n \in \mathbb{N}$. We will denote the randomly perturbed dynamics
$P(\cdot|x) = Q^\varepsilon_{f(x)}$ by $S_\varepsilon$. Given $\mu \in P(M)$, the push forward under $S_\varepsilon$ is
defined by $(S_\varepsilon^* \mu)(A) = \int_M P(A|x) \, d\mu$.

Definition 4. A probability measure $\mu$ on M is called an invariant measure
of the random perturbation $S_\varepsilon$ of f if $S_\varepsilon^* \mu = \mu$.
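To make the push forward and its fixed points concrete, here is a rough numerical sketch on the circle [0, 1) with uniform noise. It is an Ulam-style discretization chosen for illustration, not the algorithm of Theorems A or B, and it carries no error guarantee. The function names, the use of cell midpoints to evaluate f, and the wrap-around of the noise are all assumptions of the sketch.

```python
from typing import Callable, List


def overlap_on_circle(lo: float, hi: float, a: float, b: float) -> float:
    """Length of [a, b) intersected with [lo, hi) taken mod 1 (needs hi - lo < 1)."""
    total = 0.0
    for shift in (-1.0, 0.0, 1.0):
        left = max(a, lo + shift)
        right = min(b, hi + shift)
        if right > left:
            total += right - left
    return total


def transition_matrix(f: Callable[[float], float], eps: float, n: int) -> List[List[float]]:
    """P[i][j] approximates the chance of moving from cell i to cell j.

    Assumption: f is evaluated only at the midpoint of cell i, then the
    image is spread uniformly over the eps-ball around it (eps < 1/2).
    """
    matrix = []
    for i in range(n):
        c = f((i + 0.5) / n) % 1.0
        row = [overlap_on_circle(c - eps, c + eps, j / n, (j + 1) / n) / (2 * eps)
               for j in range(n)]
        matrix.append(row)
    return matrix


def push_forward(mu: List[float], matrix: List[List[float]]) -> List[float]:
    """Discrete analogue of the push forward: (mu P)[j] = sum_i mu[i] P[i][j]."""
    n = len(mu)
    return [sum(mu[i] * matrix[i][j] for i in range(n)) for j in range(n)]


def stationary(matrix: List[List[float]], steps: int = 200) -> List[float]:
    """Iterate the push forward from the uniform vector (plain power iteration)."""
    mu = [1.0 / len(matrix)] * len(matrix)
    for _ in range(steps):
        mu = push_forward(mu, matrix)
    return mu
```

A vector returned by `stationary` approximates an invariant measure only when the iteration has converged, which the sketch does not check; comparing `mu` with `push_forward(mu, matrix)` is a simple sanity test.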

We will be interested in small random perturbations. More precisely, we
will consider the following choices for $Q^\varepsilon_x$:

1. In Theorems A and B we choose $Q^\varepsilon_x$ to be uniform on the $\varepsilon$-ball around
x. That is, $Q^\varepsilon_x = \mathrm{vol}|_{B(x,\varepsilon)}$ is Lebesgue measure restricted to the $\varepsilon$-ball
about x.

2. In Theorem C we use an everywhere supported density for $Q^\varepsilon_x = K_\varepsilon(x)$,
which is uniformly analytic. In particular, the Gaussian density
of variance $\varepsilon$ centered at x satisfies these conditions.
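The two noise choices above can be simulated directly. The sketch below samples one trajectory of the perturbed chain on the circle [0, 1) and bins it into a histogram. Wrapping the state mod 1, the function names and the default seed are assumptions made for the illustration; the Gaussian branch uses standard deviation sqrt(eps) because the text specifies variance eps.

```python
import random
from typing import Callable, List


def noisy_orbit(f: Callable[[float], float], x0: float, eps: float, steps: int,
                kernel: str = "uniform", seed: int = 0) -> List[float]:
    """Sample X_0, ..., X_steps of the randomly perturbed dynamics."""
    rng = random.Random(seed)
    x = x0
    orbit = [x]
    for _ in range(steps):
        y = f(x)  # deterministic step
        if kernel == "uniform":
            y += rng.uniform(-eps, eps)      # uniform on the eps-ball around f(x)
        elif kernel == "gaussian":
            y += rng.gauss(0.0, eps ** 0.5)  # variance eps, so std is sqrt(eps)
        else:
            raise ValueError("kernel must be 'uniform' or 'gaussian'")
        x = y % 1.0  # assumption: the state space is the circle
        orbit.append(x)
    return orbit


def histogram(orbit: List[float], bins: int) -> List[float]:
    """Empirical distribution of the orbit over equal-width bins of [0, 1]."""
    counts = [0] * bins
    for x in orbit:
        counts[min(int(x * bins), bins - 1)] += 1
    return [c / len(orbit) for c in counts]
```

For a uniquely ergodic perturbed system one would expect such histograms to stabilise as the number of steps grows, but a single sampled trajectory gives only statistical evidence, not the certified approximation that computability of the measure requires.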

2.2 Computability of probability measures

Let us first recall some basic definitions and results established in [Gac05, 
HR09]. We work on the well-studied computable metric spaces (see [EH98, 
YMT99, Wei00, Hem02, BP03]).

Definition 5. A computable metric space is a triple (X, d, S) where: 

1. (X, d) is a separable metric space, 

2. $S = \{s_i : i \in \mathbb{N}\}$ is a countable dense subset of X with a fixed
numbering,

3. $d(s_i, s_j)$ are uniformly computable real numbers.

Elements in the dense set S are called simple or ideal points. Algorithms 
can manipulate ideal points via their indexes, and thus the whole space 
can be reached by algorithmic means. Examples of spaces having natural 
computable metric structures are Euclidean spaces, the space of continuous 
functions on [0, 1] and $L^p$ spaces w.r.t. Lebesgue measure on Euclidean
spaces. 

Definition 6. A point $x \in X$ is said to be computable if there is a
computable function $\varphi : \mathbb{N} \to S$ such that

$d(\varphi(n), x) < 2^{-n}$ for all $n \in \mathbb{N}$.

Such a function $\varphi$ will be called a name of x.
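A name in the sense of Definition 6 can be written down explicitly for familiar numbers. The sketch below gives one for sqrt(2) in the real line, taking the dyadic rationals as ideal points (an assumption of the example; any dense computable set would do). It uses exact integer arithmetic so that the bound is met with no rounding error.

```python
from fractions import Fraction
from math import isqrt


def sqrt2_name(n: int) -> Fraction:
    """Return a dyadic rational q with |q - sqrt(2)| < 2^-n.

    With k = n + 1, isqrt(2 * 4^k) equals floor(2^k * sqrt(2)), so
    q = floor(2^k * sqrt(2)) / 2^k satisfies 0 <= sqrt(2) - q < 2^-k.
    """
    k = n + 1
    return Fraction(isqrt(2 << (2 * k)), 1 << k)
```

Calling `sqrt2_name(n)` for increasing n yields a sequence of ideal points converging to sqrt(2) at the rate the definition demands.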

If $x \in X$ and $r > 0$, the metric ball B(x, r) is defined as $\{y \in X : d(x, y) < r\}$.
The set $\mathcal{B} := \{B(s, q) : s \in S, q \in \mathbb{Q}, q > 0\}$ of simple balls,
which is a basis of the topology, has a canonical numbering $\mathcal{B} = \{B_i : i \in \mathbb{N}\}$.
An effective open set is an open set U such that there is an r.e. (recursively
enumerable) set $E \subset \mathbb{N}$ with $U = \bigcup_{i \in E} B_i$. If $X'$ is another computable
metric space, a function $f : X \to X'$ is computable if the sets $f^{-1}(B'_i)$ are
uniformly effectively open. Note that, by definition, a computable function
must be continuous.

As an example, consider the space [0, 1]. The collection of simple balls 
over [0, 1] can be taken to be the intervals with dyadic rational endpoints, 
i.e., rational numbers with finite binary representation. Let $\mathbb{D}$ denote the
set of dyadic rational numbers. Computability of functions over [0,1], as 
defined in the paragraph above, can be characterized in terms of oracle 
Turing Machines as follows: 
Proposition 7. A function $f : [0, 1] \to [0, 1]$ is computable if and only if
there is an oracle Turing Machine such that for any $x \in [0, 1]$, any name
$\varphi$ of x, and any $n \in \mathbb{N}$, on input n and oracle $\varphi$, will output a dyadic $d \in \mathbb{D}$
such that $|f(x) - d| < 2^{-n}$.
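The oracle formulation of Proposition 7 can be mimicked by passing the name of x as a function. The sketch below does this for the squaring map on [0, 1]; the map, the helper names and the sample name for 1/3 are choices made for the illustration. Querying the oracle at precision n + 2 suffices because |x^2 - q^2| = |x - q| |x + q| < 2^-(n+2) * 2.25 < 2^-n.

```python
from fractions import Fraction
from typing import Callable

# A name of x: maps n to a dyadic rational within 2^-n of x.
Name = Callable[[int], Fraction]


def square_with_oracle(phi: Name, n: int) -> Fraction:
    """Return a dyadic d with |x^2 - d| < 2^-n, for x in [0, 1] named by phi."""
    q = phi(n + 2)  # one oracle query, slightly more precise than requested
    return q * q    # a product of dyadic rationals is again dyadic


def one_third_name(n: int) -> Fraction:
    """A name of x = 1/3: the nearest multiple of 2^-n."""
    return Fraction(round(Fraction(2 ** n, 3)), 2 ** n)
```

The procedure never sees x itself, only finitely many approximations to it, which is why every function computable in this sense must be continuous.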

Poly-time computable functions over [0, 1] are defined as follows (see
[Ko91]).

Definition 8. $f : [0, 1] \to [0, 1]$ is polynomial time computable if there
is a machine M as in the proposition above which, in addition, always halts
in less than p(n) steps, for some polynomial p, regardless of what the oracle
function is.

We now introduce a very general notion of computability of probability
measures. When M is a computable metric space, the space P(M) of probability
measures over M inherits the computable structure. The set of simple
measures $S_{P(M)}$ can be taken to be finite rational convex combinations of
point masses supported on ideal points of M. When M is compact (which
will be our case), the weak topology is compatible with the Wasserstein-Kantorovich
distance:

$W_1(\mu, \nu) = \sup_{\varphi \in 1\text{-Lip}(M)} \left| \int \varphi \, d\mu - \int \varphi \, d\nu \right|$,

where 1-Lip(M) denotes the space of functions with Lipschitz constant less
than 1. The triple $(P(M), S_{P(M)}, W_1)$ is a computable metric space. See for
instance [HR09]. This automatically gives the following notion:

Definition 9. A probability measure $\mu$ is computable if it is a computable
point of P(M).

The definition above makes sense for any probability measure, and we
will use it in Theorems A and B. One shows that for computable measures,
the integral of computable functions is again computable (see [HR09]). Simple
examples of computable measures are Lebesgue measure, as well as any
absolutely continuous measure with a computable density function.

However, computable absolutely continuous (w.r.t. Lebesgue) measures
do not necessarily have computable density functions (simply because they
may not be continuous).

Definition 10. A probability measure $\mu$ over [0, 1] is polynomial time
computable if its cumulative distribution function $F(x) = \mu([0, x])$ is
polynomial time computable.
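To give a feel for the Wasserstein-Kantorovich distance used to define computable measures, the following sketch evaluates it exactly for two simple measures on the line. It relies on the standard one-dimensional identity $W_1(\mu, \nu) = \int |F_\mu - F_\nu| \, dx$, which is not stated in the paper and is used here as an outside fact. Representing a simple measure as a dictionary from points to masses is an assumption of the example.

```python
from fractions import Fraction
from typing import Dict

# A simple measure: finitely many rational points, each with a rational mass.
Measure = Dict[Fraction, Fraction]


def wasserstein_1d(mu: Measure, nu: Measure) -> Fraction:
    """W_1 distance between two finitely supported probability measures on R."""
    points = sorted(set(mu) | set(nu))
    total = Fraction(0)
    cdf_gap = Fraction(0)  # F_mu - F_nu on the current gap between support points
    for left, right in zip(points, points[1:]):
        cdf_gap += mu.get(left, Fraction(0)) - nu.get(left, Fraction(0))
        total += abs(cdf_gap) * (right - left)
    return total
```

For instance, moving a unit mass from 0 to 1 costs 1, and splitting it evenly between 0 and 1 instead of placing it at 1/2 costs 1/2.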
Polynomial time computability of the density function of a measure $\mu$
does not imply poly-time computability of $\mu$ (unless P = #P, see [Ko91]).
However, the situation improves under analyticity assumptions. In particular,
we will rely on the following result.

Proposition 11 ([KF88]). Assume f is analytic and polynomial time com- 
putable on [0,1]. Then 

(i) the Taylor coefficients of f form a uniformly poly-time computable se- 
quence of real numbers and, 

(ii) the measure $\mu$ with density f is polynomial time computable.
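The intuition behind part (ii) is that a power series can be integrated term by term, so a cumulative distribution function inherits fast convergence from the Taylor coefficients of the density. The sketch below shows this for the hypothetical density $f(x) = e^x/(e - 1)$ on [0, 1]. It works in floating point and is only an illustration, not the polynomial-time procedure of [KF88].

```python
import math
from typing import Sequence


def cdf_from_taylor(coeffs: Sequence[float], x: float) -> float:
    """F(x) = sum_k c_k x^(k+1) / (k+1) for a density with Taylor coefficients c_k at 0."""
    return sum(c * x ** (k + 1) / (k + 1) for k, c in enumerate(coeffs))


# Hypothetical example density f(x) = e^x / (e - 1), so c_k = 1 / (k! (e - 1)).
norm = math.e - 1.0
coeffs = [1.0 / (math.factorial(k) * norm) for k in range(25)]
```

Because the coefficients decay factorially here, 25 terms already reproduce the exact value $F(x) = (e^x - 1)/(e - 1)$ to about machine precision; in general the number of terms needed grows only linearly in the number of precision bits when the series converges geometrically.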

In the proof of Theorem C, we actually show that the invariant measure 
$\pi$ has a density function which is analytic and polynomial time computable.

3 Proof of Theorem A 
3.1 Outline of the proof 

First observe that since M is compact and the support of any ergodic
measure of $S_\varepsilon$ must contain an $\varepsilon$-ball, there can be only finitely many ergodic
measures $\mu_1, \mu_2, \dots, \mu_{N(\varepsilon)}$. The algorithm to compute them first finds all
regions that separate the dynamics into disjoint parts. For this we show that
for almost every $\varepsilon$, every ergodic measure has a basin of attraction such that
the support of the measure is well contained in the basin. More precisely,
we show:

Theorem 12. For all but countably many $\varepsilon > 0$, there exist open sets
$A_1, \dots, A_{N(\varepsilon)}$ such that for all $i = 1, \dots, N(\varepsilon)$:

(i) $\mathrm{supp}(\mu_i) \subset A_i$ and,

(ii) for every $x \in A_i$, $\mu_x = \mu_i$, where $\mu_x$ is the limiting distribution of $S_\varepsilon$
starting at x.

This is used to construct an algorithm to find these regions, which is
explained in Section 3.2, and the proof that it terminates (Theorem 25)
follows from Theorem 12.

The second part of the algorithm uses compactness of the space of measures
to find the ergodic measures within each region, by ruling out the ones
which are not invariant. Here we use the fact that if a system is uniquely
ergodic, then its invariant measure is computable (see [GHR11]). This result
is applied to the system $S_\varepsilon$ restricted to each of the regions (provided by the
algorithm described in Section 3.2) where it is uniquely ergodic.

The algorithm thus obtained has the advantage of being simple and com- 
pletely general. On the other hand, it is not well suited for a complexity 
analysis, because the search procedure is computationally extremely waste- 
ful. 

3.2 The Algorithm 

Proof of Theorem 12. For $\varepsilon > 0$, let $E(\varepsilon)$ be the set of ergodic measures of
$S_\varepsilon$. By compactness, $E(\varepsilon) = \{\mu_1, \dots, \mu_{N(\varepsilon)}\}$ is finite. For a set A, we denote
by $B_\delta(A) = \{x \in M : d(x, A) < \delta\}$ the $\delta$-neighborhood of A, and by $\overline{A}$ its
closure. For simplicity, we assume M to be a connected manifold with no
boundary so that, in particular

$\overline{B_\delta(A)} = \{x \in M : d(x, A) \le \delta\} = \overline{B}_\delta(A)$.

It is clear that the support of any ergodic measure for $S_\varepsilon$ contains the
support of at least one ergodic measure for $S_{\varepsilon - h}$, for any $h > 0$. Therefore,
the function $N : \varepsilon \mapsto N(\varepsilon)$ is monotonic in $\varepsilon$ and hence it can have at most
countably many discontinuities.

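Suppose $N(\cdot)$ is constant on an interval containing $\varepsilon$ and $\varepsilon' > \varepsilon$. Then,
for any i we have

$f(\mathrm{supp}(\mu_i(\varepsilon))) \subset f(\mathrm{supp}(\mu_i(\varepsilon')))$

and therefore, since $\varepsilon < \varepsilon'$,

$\overline{B_\varepsilon(f(\mathrm{supp}(\mu_i(\varepsilon))))} \subset \mathrm{int}(\overline{B_{\varepsilon'}(f(\mathrm{supp}(\mu_i(\varepsilon'))))})$.

Combining this observation with the following Lemma 13 shows that, if
$N(\cdot)$ is continuous at $\varepsilon$, then for any $\varepsilon' > \varepsilon$ sufficiently close to $\varepsilon$ (such that
$N(\varepsilon) = N(\varepsilon')$), it holds

$\mathrm{supp}(\mu_i(\varepsilon)) \subset \mathrm{int}(\mathrm{supp}(\mu_i(\varepsilon')))$.

The sets $A_i$ in the theorem can then be taken to be $A_i = \mathrm{int}(\mathrm{supp}(\mu_i(\varepsilon')))$,
which finishes the proof of Theorem 12. □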

Lemma 13. For every $i = 1, \dots, N(\varepsilon)$

$\overline{B_\varepsilon(f(\mathrm{supp}(\mu_i(\varepsilon))))} = \mathrm{supp}(\mu_i(\varepsilon))$.
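Proof. Write $\mu = \mu_i(\varepsilon)$. For $\delta > 0$ we have that:

$\mu(B(x,\delta)) = \int p(y, B(x,\delta)) \, d\mu(y) = \int \mathrm{vol}(B(x,\delta) \,|\, B(f(y),\varepsilon)) \, d\mu(y)$.

If $d(x, f(\mathrm{supp}(\mu))) > \varepsilon$ then clearly there is a $\delta > 0$ such that $\mu(B(x, \delta)) = 0$,
so that

$\mathrm{supp}(\mu_i(\varepsilon)) \subset \overline{B_\varepsilon(f(\mathrm{supp}(\mu_i(\varepsilon))))}$.

On the other hand, if $d(x, f(y)) < \varepsilon$ for some $y \in \mathrm{supp}(\mu)$, then for any $\delta$
small enough we have

$B(x,\delta) \subset B(y',\varepsilon)$

for any $y' \in B(f(y),\delta)$. It follows that $\mathrm{vol}(B(x,\delta) \,|\, B(f(s),\varepsilon)) = \frac{\mathrm{vol}(B(x,\delta))}{\mathrm{vol}(B(x,\varepsilon))} > 0$
for all $s \in f^{-1}(B(f(y),\delta))$ and therefore

$\int \mathrm{vol}(B(x,\delta) \,|\, B(f(s),\varepsilon)) \, d\mu(s) \ge \frac{\mathrm{vol}(B(x,\delta))}{\mathrm{vol}(B(x,\varepsilon))} \, \mu(f^{-1}(B(f(y),\delta))) > 0$

so that

$B_\varepsilon(f(\mathrm{supp}(\mu_i(\varepsilon)))) \subset \mathrm{supp}(\mu_i(\varepsilon))$.
Since $\mathrm{supp}(\mu)$ is closed, the claim follows. □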

We now set the language we will use in describing the algorithm computing
the ergodic measures. Fix $\varepsilon > 0$. Let $\xi = \{a_1, \dots, a_\ell\}$ be a finite open
cover of M.

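Definition 14. For any open set $A \subset M$ and any $\delta > 0$ let

$\xi_{\mathrm{in}}(A) = \{a \in \xi : a \subset \bigcap_{x \in A} B_\delta(x)\}$

denote the $\delta$-inner neighborhood of A in $\xi$.
Define the $\delta$-inner iteration $f_{\mathrm{in}} : 2^\xi \to 2^\xi$ by:

1. $f_{\mathrm{in}}(\emptyset) = \emptyset$,

2. For all $a \in \xi$, $f_{\mathrm{in}}(a) = \xi_{\mathrm{in}}(f(a))$,

3. $f_{\mathrm{in}}(\{a_1, \dots, a_m\}) = \bigcup_{i \le m} f_{\mathrm{in}}(a_i)$.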
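Definition 15. For any open set $A \subset M$ and any $\delta > 0$ let

$\xi_{\mathrm{out}}(A) = \{a \in \xi : a \cap B_\delta(A) \neq \emptyset\}$

denote the $\delta$-outer neighborhood of A in $\xi$.

Define the $\delta$-outer iteration $f_{\mathrm{out}} : 2^\xi \to 2^\xi$ by:

1. $f_{\mathrm{out}}(\emptyset) = \emptyset$,

2. For all $a \in \xi$, $f_{\mathrm{out}}(a) = \xi_{\mathrm{out}}(f(a))$,

3. $f_{\mathrm{out}}(\{a_1, \dots, a_m\}) = \bigcup_{i \le m} f_{\mathrm{out}}(a_i)$.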
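The inner and outer iterations can be made concrete in one dimension, where atoms are open intervals with rational endpoints. The sketch below assumes that exact bounds (inf, sup) of f on each atom are available through a helper called `image_bounds` here; that helper, the cover and the example map are hypothetical and chosen only for illustration. An atom (l, r) lies in the intersection of the balls $B_\delta(y)$, y in an image with bounds lo and hi, exactly when hi - delta <= l and r <= lo + delta; it meets $B_\delta$ of the image when l < hi + delta and r > lo - delta.

```python
from fractions import Fraction
from typing import Callable, FrozenSet, Iterable, Sequence, Tuple

Atom = Tuple[Fraction, Fraction]  # open interval (left, right)
ImageBounds = Callable[[Atom], Tuple[Fraction, Fraction]]  # atom -> (inf f, sup f)


def inner_neighborhood(cover: Sequence[Atom], lo: Fraction, hi: Fraction,
                       delta: Fraction) -> FrozenSet[Atom]:
    """Atoms contained in every delta-ball centred in the image [lo, hi]."""
    return frozenset(a for a in cover if hi - delta <= a[0] and a[1] <= lo + delta)


def outer_neighborhood(cover: Sequence[Atom], lo: Fraction, hi: Fraction,
                       delta: Fraction) -> FrozenSet[Atom]:
    """Atoms that meet the delta-neighborhood of the image [lo, hi]."""
    return frozenset(a for a in cover if a[0] < hi + delta and a[1] > lo - delta)


def f_in(cover: Sequence[Atom], image_bounds: ImageBounds,
         atoms: Iterable[Atom], delta: Fraction) -> FrozenSet[Atom]:
    """Inner iteration: union of inner neighborhoods of the images of the atoms."""
    result = set()
    for a in atoms:
        lo, hi = image_bounds(a)
        result |= inner_neighborhood(cover, lo, hi, delta)
    return frozenset(result)


def f_out(cover: Sequence[Atom], image_bounds: ImageBounds,
          atoms: Iterable[Atom], delta: Fraction) -> FrozenSet[Atom]:
    """Outer iteration: union of outer neighborhoods of the images of the atoms."""
    result = set()
    for a in atoms:
        lo, hi = image_bounds(a)
        result |= outer_neighborhood(cover, lo, hi, delta)
    return frozenset(result)


# Hypothetical example: overlapping intervals of length 1/4 and f(x) = x/2 + 1/4.
cover = [(Fraction(i, 8), Fraction(i + 2, 8)) for i in range(7)]
delta = Fraction(1, 4)


def image_bounds(a: Atom) -> Tuple[Fraction, Fraction]:
    # f is increasing, so the bounds are attained at the endpoints of the atom.
    return (a[0] / 2 + Fraction(1, 4), a[1] / 2 + Fraction(1, 4))
```

In this example the inner image of an atom is always contained in its outer image, which is the sandwich the algorithm exploits: inner orbits under-approximate the supports, outer orbits over-approximate them.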



Definition 16. An atom $a \in \xi$ is inner-periodic if



In the following, we choose $\delta < \varepsilon$ and let $\xi$ be a covering such that for a
small interval around $\delta$ and all $a \in \xi$, $f_{\mathrm{in}}(a)$ is constant and non-empty.

Definition 17. The inner orbit of an atom $a \in \xi$ is defined to be

$O_{\mathrm{in}}\{a\} = \bigcup_{k} f^{k}_{\mathrm{in}}\{a\}$.

Definition 18. A collection of atoms of $\xi$ is called inner-irreducible if all
of them have the same inner orbit.

Remark 19. If a collection of atoms is inner-irreducible, then every one of
these atoms is inner-periodic.

Proposition 20. The inner map $f_{\mathrm{in}}$ and outer map $f_{\mathrm{out}}$ are computable.

Proof. By the choice of $\delta$, the condition $a' \subset \bigcap_{x \in a} B_\delta(f(x))$ can be decided,
which implies computability of $f_{\mathrm{in}}$. Computability of $f_{\mathrm{out}}$ follows by a similar
argument. □

Proposition 21. For every $a \in \xi$, we can decide whether or not a is
inner-periodic.

Proof. Because $f_{\mathrm{in}}$ is computable. □

The Algorithm. The description of the algorithm to find the basins of
attraction of the invariant measures $\mu_i$ is as follows. First choose some cover
$\xi$ as above. Then:
1. Find all the inner-periodic atoms of $\xi$, and call their collection P.

2. (Inner Reduction) Here we reduce P to a maximal subset $\xi_{\mathrm{irr}}$ which
contains only inner-periodic pieces whose inner-orbits are inner-irreducible
and disjoint.

First compute the inner orbits $\{O_1, \dots, O_{|P|}\}$.
Lemma 22. If $O_i \cap O_j \neq \emptyset$ then there is $k_{ij}$ such that

$O_{k_{ij}} \subset O_i \cap O_j$.

Proof. Let $a \in O_i \cap O_j$. Since $O_{\mathrm{in}}(a)$ is finite, it must contain an
inner-periodic element. □

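To compute $\xi_{\mathrm{irr}}$ start by setting $\xi_{\mathrm{irr}} = P$. Then, as long as there are
$a_i, a_j \in \xi_{\mathrm{irr}}$, $i \neq j$ such that $O_i \cap O_j \neq \emptyset$, set

$\xi_{\mathrm{irr}} := (\xi_{\mathrm{irr}} \setminus \{a_i, a_j\}) \cup \{a_{k_{ij}}\}$.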



Lemma 23. $\xi_{\mathrm{irr}}$ contains only inner-periodic pieces whose inner-orbits
are inner-irreducible and disjoint. By construction, the cardinality of
$\xi_{\mathrm{irr}}$ is maximal.

Proof. At each step the cardinality of $\xi_{\mathrm{irr}}$ is reduced by 1, so that the
procedure stops after at most $|P| - 1$ steps. It is evident that the remaining
atoms have disjoint inner-orbits. Let $a \in \xi_{\mathrm{irr}}$ and $a_i \in O_{\mathrm{in}}(a)$.
If $a_i$ is inner-periodic, then it was eliminated during the procedure
when compared against a, which means that $a \in O_{\mathrm{in}}(a_i)$. If $a_i$ was
not inner-periodic, then there is some inner-periodic element $a_j$ in
$O_{\mathrm{in}}(a_i)$ which was eliminated when compared to a, which implies that
$a \in O_{\mathrm{in}}(a_j) \subset O_{\mathrm{in}}(a_i)$. This shows that $O_{\mathrm{in}}(a)$ is inner-irreducible.
Let $a^* \notin \xi_{\mathrm{irr}}$. Then $a^*$ was eliminated in the procedure, which means
that $O_{\mathrm{in}}(a^*)$ cannot be disjoint from $\xi_{\mathrm{irr}}$. The cardinality of $\xi_{\mathrm{irr}}$ is
therefore maximal. □

Remark 24. The support of any ergodic measure contains the inner 
orbit of at least one element in £j rr . 
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3. If for all Oj, dj in ^ rr , ou t(ai) n Cout(oj) = then stop and return £j rr , 
otherwise refine £ and go to (1). 
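
Steps 1 and 2 above can be sketched in code. The following is a minimal illustration and not the authors' implementation: it assumes the inner map is already available as a finite table `f_in` sending each atom index to the non-empty set of atom indices in its image, and it takes the inner orbit to be the union of $f_{\mathrm{in}}^k(a)$ over $k \ge 1$, so that an atom is treated as inner-periodic when it lies in its own inner orbit. Both conventions and all function names are assumptions of this sketch.

```python
from itertools import combinations


def inner_orbit(f_in, a):
    """Atoms reachable from a by one or more applications of f_in (assumed k >= 1)."""
    seen = set()
    stack = list(f_in[a])
    while stack:
        b = stack.pop()
        if b not in seen:
            seen.add(b)
            stack.extend(f_in[b])
    return frozenset(seen)


def inner_reduction(f_in):
    """Reduce the inner-periodic atoms to representatives with disjoint inner orbits."""
    orbits = {a: inner_orbit(f_in, a) for a in f_in}
    # Step 1: inner-periodic atoms (assumed to mean: a lies in its own inner orbit).
    periodic = {a for a in f_in if a in orbits[a]}
    irr = set(periodic)
    # Step 2: while two representatives have intersecting orbits, replace them
    # by one inner-periodic atom taken from the intersection (cf. Lemma 22).
    while True:
        clash = next(((a, b) for a, b in combinations(sorted(irr), 2)
                      if orbits[a] & orbits[b]), None)
        if clash is None:
            return irr
        a, b = clash
        common = orbits[a] & orbits[b]
        c = min(x for x in common if x in periodic)
        irr -= {a, b}
        irr.add(c)
```

Each pass removes at least one representative, so the loop stops after at most $|P| - 1$ passes, in line with the proof of Lemma 23. Step 3 (disjointness of the outer orbits and refinement of the cover) is not covered by this sketch.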

Theorem 25. For all but countably many $\epsilon$, the above algorithm terminates
and returns $\xi_{\mathrm{irr}}$. Moreover, if $O_i$ denotes the inner orbit of the $i$-th element
of $\xi_{\mathrm{irr}}$, then $S_\epsilon$ has exactly $|\xi_{\mathrm{irr}}|$-many ergodic measures, and the support of
each of them contains exactly one of the $O_i$.

Proof. By Theorem 12 we can assume that $\epsilon$ is such that there exist disjoint
open sets $A_1, \ldots, A_{N(\epsilon)}$ such that for all $i = 1, \ldots, N(\epsilon)$:

(i) $\mathrm{supp}(\mu_i) \subset A_i$ and,

(ii) for every $x \in A_i$, $\mu_x = \mu_i$, where $\mu_x$ is the limiting measure starting
at $x$.

Therefore, each element of the list $\xi_{\mathrm{irr}}$ constructed in step 2 has an inner
orbit contained in the support of some ergodic measure. The algorithm
terminates because of two facts: (i) for a cover $\xi$ fine enough, the inner
orbits of two different elements of the list $\xi_{\mathrm{irr}}$ must be contained in the
supports of two different ergodic measures; (ii) for a cover finer than the
minimal gap between the supports and their basins, it is guaranteed that the
outer orbits will also be disjoint. □

Proof of Theorem A. Use the above algorithm to construct the outer
irreducible pieces. Each of them is a computable forward invariant set. The
perturbed system $S_\epsilon$ restricted to each of these pieces is computable and
uniquely ergodic. The associated invariant measures are therefore
computable ([GHR11]). □
4 Proof of Theorem B



4.1 Outline of the Proof 

The idea of the algorithm is to exploit the mixing properties of the transition
operator $\mathcal{P}$ of the perturbed system $S_\epsilon$. Since $\mathcal{P}$ may not have a spectral
gap, we construct a related transition operator $\bar{\mathcal{P}}$ that has the same invariant
measure as $\mathcal{P}$ while having a spectral gap (see Lemma 28 and Proposition 29).

The algorithm then computes a finite matrix approximation $Q$ of $\mathcal{P}$ with
the following properties: (i) $Q$ has a simple real eigenvalue near 1, (ii) the
corresponding eigenvector $\psi$ can be chosen to have only non-negative entries
and (iii) the density associated to $\psi$ (see below) is $L^1$-close to the stationary
distribution of $\mathcal{P}$.

To construct the main algorithm $\mathcal{A}$, to each precision parameter $\alpha$ we
associate a partition $\xi = \xi(\alpha)$ of the space $M$ into regular pieces of size
$\delta = 1/O(\mathrm{poly}(1/\alpha))^{1/d}$, where $d$ denotes the dimension of $M$. On input $\alpha$ the
algorithm $\mathcal{A}$ outputs a list $\{w_a\}_{a \in \xi}$ of $O(\mathrm{poly}(1/\alpha))$-dyadic numbers, which is
to be interpreted as the piece-wise constant function

$$x \mapsto \sum_{a \in \xi} w_a \, 1\{x \in a\}.$$

For any atom $a_i \in \xi$ let $c_i$ denote its center point. Below, $\epsilon$ denotes the
noise amplitude and $\varepsilon$ the precision of the computation. The algorithm works
as follows:

1. Compute $f(c_i)$ with some precision $\varepsilon$, that we will specify later: $f_\varepsilon(c_i)$
(a $\log(1/\varepsilon)$-dyadic number).

2. For every $a_j \in \xi$ do:

• Compute $d(f_\varepsilon(c_i), c_j)$ with precision $\varepsilon$: $d_\varepsilon(f_\varepsilon(c_i), c_j)$ (also a
$\log(1/\varepsilon)$-dyadic number).

• Set $p_{ij}$ to be an $\varepsilon$-approximation of $1/\mathrm{vol}(B_\epsilon)$ iff

$$d_\varepsilon(f_\varepsilon(c_i), c_j) < \epsilon - m(\delta) - 2\varepsilon - \delta,$$

where $m(\delta)$ (a polynomial in $\delta$) denotes the uniform modulus of
continuity of $f$ (see Equation 5). Otherwise put $p_{ij} = 0$ (one
can assume all the previous numbers to be rational, and then the
inequality can be decided). Clearly, the computation of each $p_{ij}$
can be achieved in polynomial time in $\log(1/\varepsilon)$.

3. Compute the unique normalized Perron-Frobenius eigenvector $\psi$ of the
$|\xi| \times |\xi|$ matrix $(p_{ij})$, and output the list $\{w_a\}$ where $w_a = \psi_a$.

The key point is that the matrix $(p_{ij})$ can be seen as a representation
of the sub-Markov transition kernel $\mathcal{P}_\xi(x, dy) = p_x(y)\,dy$, where

$$p_x(y) = \sum_{i,j} p_{ij} \, 1\{x \in a_i\} \, 1\{y \in a_j\}.$$

Proposition 31 shows that the mass deficiency of the sub-Markov approximation
$\mathcal{P}_\xi$ is uniformly small. Furthermore, we have $\mathcal{P}_\xi \le \mathcal{P}$, and therefore
Lemma 30 shows that the density associated to the above computed eigenvector
$\psi$ can be made $\alpha$-close to the invariant density of $\mathcal{P}$ by choosing the
computation precision smaller than $O(\delta)$.

One then computes a finite-dimensional approximation, which has a
spectral gap. Moreover, this approximation is such that its invariant density
is close to the invariant density of $S_\epsilon$.
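
The discretization described in this outline can be illustrated numerically. The sketch below is only a toy version under strong assumptions that are not in the text: $M = [0,1]$, uniform noise on the interval of radius `eps` (so the volume of the noise ball is `2 * eps`), a map `f` with known Lipschitz constant so that $m(\delta) \le$ `lipschitz * delta`, exact floating-point evaluation in place of the dyadic precision of the computation, and plain power iteration in place of an exact eigenvector computation. All names are hypothetical.

```python
import math


def invariant_density(f, lipschitz, eps, n_atoms, n_iter=200):
    """Piecewise-constant approximation of the invariant density on [0, 1]."""
    delta = 1.0 / n_atoms                              # size of each atom
    centers = [(i + 0.5) * delta for i in range(n_atoms)]
    radius = eps - lipschitz * delta - delta           # eps - m(delta) - delta
    p = 1.0 / (2.0 * eps)                              # 1 / vol(B_eps) in dimension 1
    # T[i][j] = p_ij * vol(a_j): mass sent from atom i to atom j (sub-Markov).
    T = [[p * delta if abs(f(ci) - cj) < radius else 0.0 for cj in centers]
         for ci in centers]
    # Power iteration for the left Perron-Frobenius eigenvector of T.
    psi = [1.0 / n_atoms] * n_atoms
    for _ in range(n_iter):
        new = [sum(psi[i] * T[i][j] for i in range(n_atoms))
               for j in range(n_atoms)]
        total = sum(new)
        psi = [v / total for v in new]
    density = [v / delta for v in psi]                 # value of the density on each atom
    return T, density
```

For instance, with `f = lambda x: 0.5 + 0.25 * math.sin(2 * math.pi * x)`, `lipschitz = 0.5 * math.pi`, `eps = 0.2` and `n_atoms = 100`, every row of `T` sums to slightly less than 1, which is the mass deficiency that Proposition 31 controls.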

4.2 Rate of convergence 

Here we essentially show that the Markov kernel $\mathcal{P}$ of the perturbed map $S_\epsilon$
has a spectral gap property. For any cover $\xi$ of $M$,

1. define

for all $a_i \in \xi$,

2. define furthermore the sub-Markov matrix $Q$ by

$$Q(a_i \to a_j) = Q(i \to j) = Q_{ij} = \begin{cases} \mathrm{vol}(a_j)/\mathrm{vol}(B_\epsilon) & \text{if } a_j \in f_{\mathrm{in}}(a_i), \\ 0 & \text{otherwise,} \end{cases}$$

for any two atoms, which defines a weighted oriented graph on $\xi$,

3. and finally, define the numbers

$$N(a_i \to a_j) = N(i \to j) = N_{ij} = \inf\{n \ge 1 : Q^n_{ij} > 0\} \in \{1, 2, \ldots, \infty\}$$

for any two atoms of $\xi$.






The standing assumption in this section is that the cover $\xi$ of $M$ is such
that

$$\xi_{\mathrm{irr}} = \bigcap_{a \in \xi} O_{\mathrm{in}}(a) \qquad (1)$$

is non-empty. We will refer to $\xi_{\mathrm{irr}}$ as the inner irreducible part of $\xi$.

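Lemma 26 (Comparison lemma). The estimate

$$\mathcal{P}^m(x, A \cap a_j) \ge 1\{x \in a_i\} \, Q^m_{ij} \, \mathrm{vol}(A \mid a_j)$$

is satisfied for all $x \in M$, any $a_i, a_j \in \xi$, and all $A \in \mathcal{B}$. In particular, for any
$a_i, a_j \in \xi$, and any two $\xi_0, \xi_1 \subset \xi$,

$$\mathcal{P}^m(x, A) \ge 1\{x \in a_i\} \, Q^m_{ij} \, \mathrm{vol}(A \mid a_j),$$
$$\mathcal{P}^m(x, A) \ge \sum_{a_i \in \xi_0} \sum_{a_j \in \xi_1} 1\{x \in a_i\} \, Q^m_{ij} \, \mathrm{vol}(A \mid a_j)$$

hold true for all $x \in M$, $A \in \mathcal{B}$ and $m \ge 1$.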

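Proof. Let $A \in \mathcal{B}$, as well as $a_i \in \xi$ and $x \in a_i$ be arbitrary, but fixed. Then
for any integer $m \ge 1$ and any $a_j \in \xi$

$$\mathcal{P}^m(x, A \cap a_j) = \int \mathcal{P}^{m-1}(x, dx_{m-1}) \, \mathcal{P}(x_{m-1}, A \cap a_j)$$
$$\ge \sum_{a_k \in \xi :\, a_j \in f_{\mathrm{in}}(a_k)} \int_{a_k} \mathcal{P}^{m-1}(x, dx_{m-1}) \, \mathcal{P}(x_{m-1}, A \cap a_j)$$
$$= \sum_{a_k \in \xi} \mathcal{P}^{m-1}(x, a_k) \, Q_{k,j} \, \mathrm{vol}(A \mid a_j),$$

so that we obtain

$$\mathcal{P}^m(x, A \cap a_j) \ge \sum_{a_k \in \xi} \mathcal{P}(x, a_k) \, Q^{m-1}_{k,j} \, \mathrm{vol}(A \mid a_j)$$

by induction. Because $x \in a_i$ and $\mathcal{P}(x, a_k) \ge Q_{i,k}$ we obtain the estimate

$$\mathcal{P}^m(x, A \cap a_j) \ge Q^m_{i,j} \, \mathrm{vol}(A \mid a_j) \quad \text{for all } x \in a_i,\ a_j \in \xi$$

for all $m \ge 1$. □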
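Denote for $x \in M$ and $A \in \mathcal{B}$ by

$$\bar{\mathcal{P}}(x, A) = \frac{1}{N_\xi} \sum_{n=1}^{N_\xi} \mathcal{P}^n(x, A), \qquad N_\xi = \max_{a_i \in \xi} \max_{a_j \in \xi_{\mathrm{irr}}} N(a_i \to a_j) \qquad (2)$$

a new Markov transition kernel on $M$. By choice of $\xi_{\mathrm{irr}}$ the number $N_\xi$ is
finite, and hence $\bar{\mathcal{P}}(x, A)$ is a well-defined Markov transition kernel on $M$.
Furthermore, let

$$\beta = \min_{a_i \in \xi} \frac{1}{N_\xi} \sum_{n=1}^{N_\xi} \sum_{a_j \in \xi_{\mathrm{irr}}} Q^n_{ij}, \qquad 0 < \beta \le 1, \qquad (3)$$

where the fact that $\beta > 0$ is shown in the following lemma.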

Lemma 27 (Lower bound on $\beta$). The following (rather pessimistic) bound
on $\beta$



vol (a) 
ae£ vol(Z? e 



mm 



N. 



holds, and shows in particular that $\beta > 0$.
Proof. From its definition in (3) we have 

" V O n . > _ min V O . 



s n=l ajSSirr S ajS^ir 



Furthermore, due to the lower bound 



JO if dj $ /in(ai) . vol(a) 



_g if dj G /in(ai) ae? vol(£, 

the above can be further estimated from below by 



1 1 a N s 

? * E ^ > W™™ E ^ ^ lv7#^ 



Uj c^irr u j ^=sirr 



□ 



Lemma 28 (Doeblin condition for $\bar{\mathcal{P}}$). There exists a probability measure
$\varphi$ on $M$ such that $\inf_{x \in M} \bar{\mathcal{P}}(x, A) \ge \beta \, \varphi(A)$ holds for all $A \in \mathcal{B}$.
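Proof. By Lemma 26 we have for any $a_k \in \xi$

$$\mathcal{P}^n(x, A) \ge 1\{x \in a_k\} \sum_{a_j \in \xi_{\mathrm{irr}}} Q^n_{kj} \, \mathrm{vol}(A \mid a_j)$$

for all $x \in M$, $A \in \mathcal{B}$ and all $n \ge 1$. Therefore,

$$\bar{\mathcal{P}}(x, A) = \frac{1}{N_\xi} \sum_{n=1}^{N_\xi} \mathcal{P}^n(x, A) \ge 1\{x \in a_k\} \, \frac{1}{N_\xi} \sum_{n=1}^{N_\xi} \sum_{a_j \in \xi_{\mathrm{irr}}} Q^n_{kj} \, \mathrm{vol}(A \mid a_j)$$
$$\ge 1\{x \in a_k\} \, \min_{a_i \in \xi} \frac{1}{N_\xi} \sum_{n=1}^{N_\xi} \sum_{a_j \in \xi_{\mathrm{irr}}} Q^n_{ij} \, \mathrm{vol}(A \mid a_j)$$

for all $a_k \in \xi$ and all $x$. And since $x$ is contained in at least one element of
$\xi$ we obtain the bound

$$\bar{\mathcal{P}}(x, A) \ge \min_{a_i \in \xi} \frac{1}{N_\xi} \sum_{n=1}^{N_\xi} \sum_{a_j \in \xi_{\mathrm{irr}}} Q^n_{ij} \, \mathrm{vol}(A \mid a_j)$$

uniformly in $x \in M$ and $A \in \mathcal{B}$.

Now define the measure $\psi$ on $M$ by

$$\psi(A) = \min_{a_i \in \xi} \frac{1}{N_\xi} \sum_{n=1}^{N_\xi} \sum_{a_j \in \xi_{\mathrm{irr}}} Q^n_{ij} \, \mathrm{vol}(A \mid a_j).$$

The choice of $N_\xi$ implies that

$$\psi(a_k) = \min_{a_i \in \xi} \frac{1}{N_\xi} \sum_{n=1}^{N_\xi} \sum_{a_j \in \xi_{\mathrm{irr}}} Q^n_{ij} \, \mathrm{vol}(a_k \mid a_j) = \min_{a_i \in \xi} \frac{1}{N_\xi} \sum_{n=1}^{N_\xi} Q^n_{ik} \ge \min_{a_i \in \xi} \frac{1}{N_\xi} \, Q^{N_{ik}}_{ik} > 0$$

for any $a_k \in \xi_{\mathrm{irr}}$. In particular, the measure $\psi$ is non-trivial. Therefore,

$$\beta \, \varphi(A) = \psi(A), \qquad 1 \ge \beta = \psi(M) = \min_{a_i \in \xi} \frac{1}{N_\xi} \sum_{n=1}^{N_\xi} \sum_{a_j \in \xi_{\mathrm{irr}}} Q^n_{ij} > 0,$$

which finishes the proof. □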
Proposition 29 (Invariant measure for $\mathcal{P}$ and $\bar{\mathcal{P}}$; rate of convergence).

1. The Markov kernel $\bar{\mathcal{P}}$ has a unique invariant probability measure $\pi$.

2. For any initial measure $\mu_0$ on $M$ the estimate

$$|\mu_0 \bar{\mathcal{P}}^n - \pi|_{TV} \le (1 - \beta)^n$$

holds for all $n \ge 1$, where $\beta$ is as in Lemma 28, and the total variation
norm of a signed measure $\nu$ is defined to be $|\nu|_{TV} = \sup_{|A| \le 1} \nu(A)$.

3. The Markov kernel $\mathcal{P}$ has a unique invariant probability measure, which
is also given by $\pi$.

Proof. The first two claims are immediate consequences of the Doeblin
condition for $\bar{\mathcal{P}}$ proved in Lemma 28.

If $\mu$ is an invariant probability measure for $\mathcal{P}$, then it clearly must be
invariant for $\bar{\mathcal{P}}$. Therefore the first of the three claimed statements implies
that $\mathcal{P}$ can have at most one invariant measure, which must be $\pi$.

By invariance of $\pi$ for $\bar{\mathcal{P}}$ and $\mathcal{P}\bar{\mathcal{P}} = \bar{\mathcal{P}}\mathcal{P}$ the identity $\pi \mathcal{P} = \pi \bar{\mathcal{P}}^n \mathcal{P} =
\pi \mathcal{P} \bar{\mathcal{P}}^n$ holds for all $n \ge 1$, so that the second of the claimed expressions
shows that $\pi \mathcal{P} = \lim_{n \to \infty} \pi \mathcal{P} \bar{\mathcal{P}}^n = \pi$, which finishes the proof. □
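
The geometric rate in Proposition 29 can be observed on a finite-state chain. The sketch below is an illustration under assumptions that are not in the text: the kernel is a small row-stochastic matrix that satisfies the Doeblin condition directly (no averaging over powers is needed), the minorizing constant is taken to be the sum of the column minima, and total variation is measured with the half-$L^1$ convention. Function names are hypothetical.

```python
def doeblin_beta(P):
    """Largest beta with P[i][j] >= beta * phi[j] for all i: sum of column minima."""
    n = len(P)
    return sum(min(P[i][j] for i in range(n)) for j in range(n))


def step(mu, P):
    """One step of the chain acting on a row vector: mu -> mu P."""
    n = len(P)
    return [sum(mu[i] * P[i][j] for i in range(n)) for j in range(n)]


def tv(mu, nu):
    """Total variation distance, half-L1 convention (an assumption of this sketch)."""
    return 0.5 * sum(abs(a - b) for a, b in zip(mu, nu))


def check_doeblin_rate(P, mu, n_steps=30, burn_in=2000):
    """Return beta and the pairs (distance to stationarity, (1 - beta)^n)."""
    pi = [1.0 / len(P)] * len(P)
    for _ in range(burn_in):          # approximate the invariant measure
        pi = step(pi, P)
    beta = doeblin_beta(P)
    pairs = []
    for n in range(1, n_steps + 1):
        mu = step(mu, P)
        pairs.append((tv(mu, pi), (1.0 - beta) ** n))
    return beta, pairs
```

Calling `check_doeblin_rate` on a 3-state matrix and a point mass as initial measure returns distances that stay below the corresponding powers of $1 - \beta$.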



4.3 Approximation of the stationary distribution 

In what follows we assume that the perturbed system has a unique ergodic
measure and that its support is strictly contained in $M$. Moreover, we
assume that $\bar{\mathcal{P}}$ has a spectral gap $0 < \theta < 1$ in the following sense. Let
$N \ge 1$ be fixed, and denote by

$$\bar{\mathcal{P}} = \frac{1}{N} \sum_{k=1}^{N} \mathcal{P}^k \qquad (4a)$$

the Markov kernel corresponding to the sampled chain with uniform
sampling distribution on $\{1, \ldots, N\}$. The spectral gap property that we assume
is that for any two probability measures $\nu$ and $\nu'$

$$|\nu \bar{\mathcal{P}}^n - \nu' \bar{\mathcal{P}}^n|_{TV} \le C (1 - \theta)^n \, |\nu - \nu'|_{TV} \qquad (4b)$$

for all $n \ge 1$, where $C$ is some constant that does not depend on the choice
of the measures $\nu$ and $\nu'$.

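Lemma 30 (Sub-Markovian approximation). Let $Q$ be a sub-Markov kernel
on $M$ such that $Q \le \mathcal{P}$, and introduce

$$\kappa_- = \inf_{x \in M} \big[ \mathcal{P}(x, M) - Q(x, M) \big], \qquad \kappa_+ = \sup_{x \in M} \big[ \mathcal{P}(x, M) - Q(x, M) \big],$$

which thus satisfy $0 \le \kappa_- \le \kappa_+ \le 1$. Let $\psi$ be a probability measure on $M$,
and let $\lambda \in \mathbb{R}$ be such that $\lambda \psi = \psi Q$. Then the estimates

$$0 \le \kappa_- \le 1 - \lambda \le \kappa_+ \le 1 \quad \text{and} \quad |\pi - \psi|_{TV} \le \frac{C}{\theta} \Big[ 1 - \frac{1}{N} \sum_{k=1}^{N} \lambda^k \Big] \le \frac{C}{\theta} \Big[ 1 - \frac{1}{N} \sum_{k=1}^{N} (1 - \kappa_+)^k \Big]$$

hold.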



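Proof. Since $\pi$ is stationary for $\mathcal{P}$, it is also stationary for $\bar{\mathcal{P}}$. Therefore, we
have that $(\pi - \psi) \bar{\mathcal{P}} - (\pi - \psi) = \psi - \psi \bar{\mathcal{P}}$, and hence

$$(\pi - \psi) \bar{\mathcal{P}}^n - (\pi - \psi) = \sum_{k=0}^{n-1} (\psi - \psi \bar{\mathcal{P}}) \bar{\mathcal{P}}^k$$

for any $n \ge 1$. Since $\psi$ and $\psi \bar{\mathcal{P}}$ are probability measures on $M$, the assumed
spectral gap implies

$$|(\pi - \psi) \bar{\mathcal{P}}^n - (\pi - \psi)|_{TV} \le \sum_{k=0}^{n-1} C (1 - \theta)^k \, |\psi - \psi \bar{\mathcal{P}}|_{TV} \le \frac{C}{\theta} \, |\psi - \psi \bar{\mathcal{P}}|_{TV}$$

for all $n \ge 1$, and hence $|\pi - \psi|_{TV} \le \frac{C}{\theta} |\psi - \psi \bar{\mathcal{P}}|_{TV}$ by passing to the limit
$n \to \infty$.

Furthermore, since $Q$ is sub-Markovian and $Q \le \mathcal{P}$ we have that

$$\lambda = \lambda \, \psi(M) = [\psi Q](M) = [\psi \mathcal{P}](M) - \big( [\psi \mathcal{P}](M) - [\psi Q](M) \big) = 1 - \int \psi(dx) \, \big[ \mathcal{P}(x, M) - Q(x, M) \big]$$

and hence

$$0 \le 1 - \kappa_+ \le \lambda \le 1 - \kappa_- \le 1$$

follow for the upper and lower bounds on $\lambda$.
Finally, note that with

$$\bar{Q} = \frac{1}{N} \sum_{k=1}^{N} Q^k, \qquad \bar{\lambda} = \frac{1}{N} \sum_{k=1}^{N} \lambda^k$$

it follows that

$$\psi \bar{\mathcal{P}} - \psi = \psi \bar{\mathcal{P}} - \psi \bar{Q} + \psi \bar{Q} - \psi = (\psi \bar{\mathcal{P}} - \psi \bar{Q}) - (1 - \bar{\lambda}) \psi,$$

where $\psi \bar{\mathcal{P}} - \psi \bar{Q}$ and $(1 - \bar{\lambda}) \psi$ are positive measures of equal total mass $1 - \bar{\lambda}$.
And since $\bar{\mathcal{P}}$ is a Markov operator the trivial bound $|\psi - \psi \bar{\mathcal{P}}|_{TV} \le 1 - \bar{\lambda}$
given by the total mass implies

$$|\pi - \psi|_{TV} \le \frac{C}{\theta} |\psi - \psi \bar{\mathcal{P}}|_{TV} \le \frac{C}{\theta} (1 - \bar{\lambda}) = \frac{C}{\theta} \Big[ 1 - \frac{1}{N} \sum_{k=1}^{N} \lambda^k \Big] \le \frac{C}{\theta} \Big[ 1 - \frac{1}{N} \sum_{k=1}^{N} (1 - \kappa_+)^k \Big]$$

and finishes the proof. □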

4.4 Time complexity of computing the ergodic measures 

For the sake of simplicity, from now on we assume $M$ to be the $d$-dimensional
cube $[0, 1]^d$ and $\xi_\delta = \{a_1, \ldots, a_{|\xi|}\}$ to be a regular partition of diameter $\delta$.
Because of regularity, all the atoms have the same volume $\mathrm{vol}(a) = \delta^d$. The
volume of any $\epsilon$-ball will be denoted by $\mathrm{vol}(B_\epsilon)$.

Let $\xi$ be a partition of diameter $\delta$. We now describe how to construct
a sub-Markov kernel $\mathcal{P}_\xi$ with a prescribed total mass deficiency. $\mathcal{P}_\xi$ will
consist of a $|\xi| \times |\xi|$ matrix whose entries will be either $0$ or $p = 1/\mathrm{vol}(B_\epsilon)$.
If the map $f$ is poly-time computable, then each entry can be decided in
polynomial time.

Let

$$m(\delta) := \sup\{ d(f(x), f(y)) : x, y \in M,\ d(x, y) \le \delta \} \qquad (5)$$

be the uniform modulus of continuity of $f$. Then of course we have that

$$m(\delta) \searrow 0 \quad \text{as } \delta \to 0$$

and

$$d(f(x), f(y)) \le m(\delta) \quad \text{whenever } d(x, y) \le \delta.$$
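Proposition 31. With $\epsilon$ the noise amplitude and $\varepsilon$ the precision of the computation,

$$\sup_{x \in M} \big[ \mathcal{P}(x, M) - \mathcal{P}_\xi(x, M) \big] \le C_M \, \frac{m(\delta) + 2\delta + 2\varepsilon}{\epsilon}$$

where $C_M$ is a constant which depends only on the manifold $M$.

Proof. Let $x \in a_i$. Denote the density of $\mathcal{P}_\xi(x, \cdot)$ by $p_x(y)$ and the density of $\mathcal{P}(x, \cdot)$ by $\hat{p}_x(y)$. Then

$$\mathcal{P}(x, M) - \mathcal{P}_\xi(x, M) = \sum_j \int_{a_j} dy \, [\hat{p}_x(y) - p_x(y)]$$
$$= \sum_j \frac{1}{\mathrm{vol}(B_\epsilon)} \Big[ \mathrm{vol}(B(f(x), \epsilon) \cap a_j) - \mathrm{vol}(a_j) \, 1\{ d_\varepsilon(c_j, f_\varepsilon(c_i)) < \epsilon - m(\delta) - \delta - 2\varepsilon \} \Big]$$
$$\le \sum_j \frac{\mathrm{vol}(a_j)}{\mathrm{vol}(B_\epsilon)} \, \big[ 1(A) - 1(B) \big]$$

(where $A = \{ d(c_j, f(c_i)) < \epsilon + m(\delta) + \delta + \varepsilon \}$
and $B = \{ d(c_j, f(c_i)) < \epsilon - m(\delta) - \delta - 3\varepsilon \}$)

$$= \sum_j \frac{\mathrm{vol}(a_j)}{\mathrm{vol}(B_\epsilon)} \, 1\{ \epsilon - m(\delta) - \delta - 3\varepsilon \le d(c_j, f(c_i)) < \epsilon + m(\delta) + \delta + \varepsilon \}$$
$$\le \frac{\mathrm{vol}(B_{\epsilon + m(\delta) + 2(\delta + \varepsilon)}) - \mathrm{vol}(B_{\epsilon - m(\delta) - 3\varepsilon - 2\delta})}{\mathrm{vol}(B_\epsilon)}$$
$$\le C_M \, \frac{m(\delta) + 2\delta + 2\varepsilon}{\epsilon}.$$

□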

5 Proof of Theorem C 
5.1 Outline of the Proof 

In the proof of Theorem B, we approximated the transfer operator by a
finite matrix $\{p_{ij}\}$, which corresponded more or less to the projection of
the operator $P$ on a finite partition $\xi$. In this sense, this discretization
was a "piece-wise constant" approximation of the operator $P$. In order
to increase the precision of this approximation, and hence the precision
$\alpha = 2^{-n}$ of the computation of the invariant measure, we are forced to
increase the resolution of the partition $\xi$. This makes the size of the finite
matrix approximation of $P$ grow exponentially in $n$.

The idea in getting rid of this exponential growth is to use a fixed
partition which will depend only on the noise $K_\epsilon$, and not on the precision
$n$. Instead of using a "piece-wise constant" approximation, we represent the
operator $P$ exactly on each $a \in \xi$ by a Taylor series. The regularity of the
transition kernel implies the corresponding regularity of the push-forward of
any density. More precisely, if $\rho^{(t)}$ denotes the density at time $t$, then

$$\rho^{(t)}(x) = \sum_{a \in \xi} 1\{x \in a\} \sum_{k=0}^{\infty} \rho^{(t)}_{a,k} \, (x - x_a)^k,$$

$$\rho^{(t+1)}(x) = \sum_{a_i \in \xi} 1\{x \in a_i\} \sum_{l=0}^{\infty} (x - x_{a_i})^l \sum_{a_j, m} \rho^{(t)}_{a_j,m} \int_{a_j} (y - x_{a_j})^m \, \frac{\partial_2^l K_f(y, x_{a_i})}{l!} \, dy$$

provides an infinite matrix representation of the transition operator in terms
of its action on the Taylor coefficients of the densities. See Section 5.2.

The assumed analytic properties of the transition kernel allow us to
truncate the power series representation of the densities (see Lemma 38),
and represent the corresponding truncation $P_N$ of the transition operator as
a finite matrix.

We then show that the size of this matrix depends linearly on the bit-size
$n$ of the precision of the calculation of the invariant density (see Theorem
36 and Proposition 39). This is where the analytic properties of the kernel
$K_\epsilon$ are used. The actual algorithm applies $P_N$ to some initial density $\rho$
sufficiently many times (linear in the bit-size precision), and then uses the
resulting vector to compute $n$ significant bits of the invariant density
$\pi(x)$ at some point $x$ by using the Taylor formula



fc=i 

This shows that the invariant density is an analytic poly-time com- 
putable function, and Proposition 11 finishes the proof. 

We now give the technical details. As mentioned in the introduction, we 
consider only the one dimensional case. 

5.2 A priori bounds 

The standing assumptions on $K_\epsilon(y, x)$ in this section are:

Assumption 32 (Uniform regularity of the transition kernel).

(i) There exist constants $C > 0$ and $\gamma > 0$ such that
$|\partial_x^k K_\epsilon(y, x)| \le C \, k! \, e^{\gamma k}$ for all $k \in \mathbb{N}$ and all $x, y \in M$.

(ii) $K_\epsilon(f(\cdot), x)$ is poly-time integrable.

Since $\epsilon$ will be fixed, we will denote the kernel $K_\epsilon(f(y), x)$ just by
$K_f(y, x)$ to shorten the notation.

Let $\mu$ be a probability measure on $M$. Recall that the transition operator
is given by

$$\mu P(dx) = dx \, \rho(x), \qquad \rho(x) = \int_M \mu(dy) \, K_\epsilon(f(y), x), \qquad (6)$$

which shows that $\mu P(dx)$ has a density for any probability measure $\mu$.

Lemma 33 (A priori regularity of $\rho$).

(i) The estimate $\sup_{x \in M} |\partial^k \rho(x)| \le C \, k! \, e^{\gamma k}$ holds for all $k \in \mathbb{N}$.

(ii) For any partition $\xi$ satisfying $e^{\gamma} \, \mathrm{diam}\, \xi < 1$ the density $\rho$ admits for
all $x$ the series representation

$$\rho(x) = \sum_{a \in \xi} 1\{x \in a\} \sum_{k=0}^{\infty} \rho_{a,k} \, (x - x_a)^k \quad \text{where } |\rho_{a,k}| \le C \, e^{\gamma k},$$

which converges absolutely and exponentially fast, uniformly in $x$.

Proof. By definition of $\rho(x)$ we have $\partial^k \rho(x) = \int_M \mu(dy) \, \partial_x^k K_\epsilon(f(y), x)$ for
all $k \in \mathbb{N}$ and all $x \in M$. Therefore, the claimed estimate on $\sup_{x \in M} |\partial^k \rho(x)|$
follows from Assumption 32. Using this result the second claim follows from
Taylor's theorem. □
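
The exponential convergence in part (ii) can be made quantitative with a geometric-series estimate. The sketch below rests on a derivation that is an assumption here rather than a statement quoted from the text: for $x$ in an atom $a$ one has $|x - x_a| \le \mathrm{diam}\,\xi$, so with $q = e^{\gamma}\,\mathrm{diam}\,\xi < 1$ the terms of order $k > N$ contribute at most $C q^{N+1}/(1-q)$. The function names are hypothetical.

```python
import math


def tail_bound(C, gamma, diam, N):
    """Geometric bound C * q^(N+1) / (1 - q) on the Taylor terms of order > N."""
    q = math.exp(gamma) * diam
    if not 0.0 < q < 1.0:
        raise ValueError("need 0 < exp(gamma) * diam < 1")
    return C * q ** (N + 1) / (1.0 - q)


def terms_needed(C, gamma, diam, n_bits):
    """Smallest N whose tail bound is at most 2^(-n_bits)."""
    N = 0
    while tail_bound(C, gamma, diam, N) > 2.0 ** (-n_bits):
        N += 1
    return N
```

With $C = 2$, $\gamma = 0$ and $\mathrm{diam}\,\xi = 1/2$ the bound equals $2^{1-N}$, so `terms_needed` grows by one for every extra bit of precision; this is the kind of linear dependence of the truncation order on $n$ that the rest of the section establishes.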

Our method will further rely on the following assumption:

Assumption 34 (Mixing assumption).

(iv) There exist constants $C > 0$ and $\theta < 1$ such that

$$\Big\| \frac{\mu P^t(dx)}{dx} - \frac{\nu P^t(dx)}{dx} \Big\|_\infty \le C \theta^t \, |\mu - \nu|_{TV} \le 2 C \theta^t \quad \text{for all } t \ge 1$$

holds for any two probability measures $\mu$ and $\nu$.

Under Assumption 34 the Markov chain generated by $P$ has a unique
invariant measure, which we denote by $\pi(dx)$. Furthermore, it also follows
that this measure has a bounded density with respect to the volume
measure on $M$. By slightly abusing notation we will denote the density of the
stationary measure by $\pi(x)$.

We now show the two facts above follow from assumption (i).
Lemma 35 (Examples for $K_\epsilon$). Part (i) of Assumption 32 is automatically
satisfied if the kernel $K_\epsilon(y, \cdot)$ is analytic, uniformly in $y$. If in addition
there exist constants $0 < c_- \le c_+$ such that $c_- \le K_\epsilon(y, x) \le c_+$, then
Assumption 34 is satisfied.

Proof. If $K_\epsilon(y, \cdot)$ is analytic, then $K_\epsilon(y, \cdot)$ admits an everywhere converging
power series representation, which by compactness of $M$ implies that there
exist $C(y) > 0$ and $\gamma(y) > 0$ such that $\sup_{x \in M} |\partial_x^k K_\epsilon(y, x)| \le C(y) \, k! \, e^{\gamma(y) k}$
for all $k \in \mathbb{N}$. The assumed uniformity of the analyticity simply means that
$C(y)$ and $\gamma(y)$ can be uniformly chosen with respect to $y$, which proves the
first part.

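Now assume the existence of $c_\pm$ as stated in the second part. Let $\mu$ and
$\nu$ be two probability measures on $M$. From the definition of the transition
operator (6)

$$\int_M [\mu P(dx) - \nu P(dx)] \, A(x) = \int_M dx \int_M [\mu(dy) - \nu(dy)] \, K_f(y, x) \, A(x)$$
$$= \int_M dx \int_M [\mu(dy) - \nu(dy)] \, [K_f(y, x) - c_-] \, A(x)$$
$$= \theta \int_M dx \int_M [\mu(dy) - \nu(dy)] \, \frac{K_f(y, x) - c_-}{\theta} \, A(x), \qquad \theta := 1 - |M| c_- < 1,$$

for any bounded function $A : M \to \mathbb{R}$. The assumed lower bound implies
that $\frac{K_f(y, x) - c_-}{\theta}$ is a probability density (with respect to $x$), and hence

$$|\mu P - \nu P|_{TV} \le \theta \, |\mu - \nu|_{TV}$$

follows. Iterating this inequality we obtain

$$|\mu P^t - \nu P^t|_{TV} \le \theta^t \, |\mu - \nu|_{TV} \le 2 \theta^t$$

for all $t \ge 1$ and any two probability measures $\mu$ and $\nu$. From the upper
bound on the kernel it follows

$$\Big\| \frac{\mu P(dx)}{dx} - \frac{\nu P(dx)}{dx} \Big\|_\infty = \sup_{x \in M} \Big| \int_M [\mu(dy) - \nu(dy)] \, K_f(y, x) \Big| \le c_+ \, |\mu - \nu|_{TV}$$

and hence

$$\Big\| \frac{\mu P^t(dx)}{dx} - \frac{\nu P^t(dx)}{dx} \Big\|_\infty \le c_+ \, |\mu P^{t-1} - \nu P^{t-1}|_{TV} \le c_+ \, \theta^{t-1} \, |\mu - \nu|_{TV}$$

as was to be shown. □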
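Because of Lemma 33 we can consider only densities satisfying the a
priori bound, and we will do so. The density at time $t$ of a probability
measure will be denoted by $\rho^{(t)}(x)$.

Using Lemma 33 we know that for any time $t$, such a density can be
written as

$$\rho^{(t)}(x) = \sum_{a \in \xi} 1\{x \in a\} \sum_{k=0}^{\infty} \rho^{(t)}_{a,k} \, (x - x_a)^k$$

and therefore

$$\rho^{(t+1)}(x) = P \rho^{(t)}(x) = \int_M \rho^{(t)}(y) \, K_f(y, x) \, dy = \sum_{a_j, m} \rho^{(t)}_{a_j,m} \int_{a_j} (y - x_{a_j})^m \, K_f(y, x) \, dy.$$

Expanding $K_f$ gives

$$\rho^{(t+1)}(x) = \sum_{a_i \in \xi} 1\{x \in a_i\} \sum_{l=0}^{\infty} (x - x_{a_i})^l \sum_{a_j, m} \rho^{(t)}_{a_j,m} \int_{a_j} (y - x_{a_j})^m \, \frac{\partial_2^l K_f(y, x_{a_i})}{l!} \, dy.$$

We can therefore represent the operator $P$, acting on densities satisfying
the a priori regularity, exactly by a matrix of size $|\xi| \times |\xi|$ whose entry
$P^{(a_i, a_j)}$ is in turn an infinite matrix with matrix elements

$$P^{(a_i, a_j)}(l, m) = \int_{a_j} (y - x_{a_j})^m \, \frac{\partial_2^l K_f(y, x_{a_i})}{l!} \, dy, \qquad l, m \ge 0. \qquad (7)$$
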
5.3 Truncation step 

The idea here is to truncate the operator $P$, represented by the infinite
matrix (7), by dropping the higher order terms. Recall Lemma 33 and the
corresponding representation of densities

$$\rho(x) = \sum_{a \in \xi} 1\{x \in a\} \sum_{k=0}^{\infty} \rho_{a,k} \, (x - x_a)^k$$

with $|\rho_{a,k}| \le C \, e^{\gamma k}$ for all $a, k$, where $e^{\gamma} \, \mathrm{diam}\, \xi < 1$. For any $N \ge 1$ we define
the truncation projection

$$\Pi_N \rho(x) := \sum_{a \in \xi} 1\{x \in a\} \sum_{k=0}^{N} \rho_{a,k} \, (x - x_a)^k, \qquad \rho_N(x) = \rho(x) - \Pi_N \rho(x), \qquad (8a)$$

where $\rho_N$ denotes the remainder term. Correspondingly, we define the
truncated transition operator by

$$P_N := \Pi_N P \, \Pi_N, \qquad (8b)$$

whose matrix elements are given by (7), with $l, m = 0, \ldots, N$. A schematic
representation of one application of the operator $P_N$ is shown in Fig. 1.



Figure 1: Graphical representation of one application of the operator $P_N$ (block structure indexed by the $|\xi|$ atoms $a$, with $N$ Taylor coefficients per atom).

The following theorem states the desired linear dependence of both the
number of iterations $t$ and the number of Taylor coefficients $N$ on the
precision parameter $n$.

Theorem 36. There exist linear functions $t(n)$ and $N(n)$ such that

$$\| \pi - P_{N(n)}^{t(n)} \rho \|_\infty \le 2^{-n}$$

for all $n \in \mathbb{N}$, uniformly in $\rho$.

Proof. We will need the following lemmas:

Let $\rho$ be a probability measure with a density of the type of Lemma 33.
Denote the densities of $\rho P^t$ by $\rho^{(t)}$ for all $t \ge 0$.



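Lemma 37. Then

$$\Pi_N \rho^{(t)} - P_N^t \rho^{(0)} = \sum_{s=0}^{t-1} P_N^s \, Q_N \, \rho^{(t-1-s)}$$

holds, where $Q_N := \Pi_N P - P_N = \Pi_N P (1 - \Pi_N)$.

Proof. Observe that the identity $\rho^{(t)} = P \rho^{(t-1)}$ can be rewritten as
$\Pi_N \rho^{(t)} = P_N \rho^{(t-1)} + Q_N \rho^{(t-1)}$, so that
$\Pi_N \rho^{(t)} = P_N^t \rho^{(0)} + \sum_{s=0}^{t-1} P_N^s Q_N \rho^{(t-1-s)}$ follows
by iteration. □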
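
The identity of Lemma 37 is purely algebraic, so it can be checked on finite matrices. The following sketch is a sanity check under simplifying assumptions that are not in the text: the operator is replaced by an arbitrary small square matrix acting on column vectors, and the truncation projection keeps the first `N` coordinates. Names are hypothetical, and the check is meant for $t \ge 1$.

```python
def matvec(A, v):
    """Matrix-vector product for lists of rows."""
    return [sum(a * x for a, x in zip(row, v)) for row in A]


def matmul(A, B):
    """Matrix-matrix product for lists of rows."""
    cols = list(zip(*B))
    return [[sum(a * b for a, b in zip(row, col)) for col in cols] for row in A]


def lemma37_gap(P, N, rho, t):
    """Largest entry of Pi rho^(t) - [P_N^t rho + sum_s P_N^s Q_N rho^(t-1-s)]."""
    n = len(P)
    Pi = [[1.0 if (i == j and i < N) else 0.0 for j in range(n)] for i in range(n)]
    comp = [[(1.0 if i == j else 0.0) - Pi[i][j] for j in range(n)] for i in range(n)]
    PN = matmul(Pi, matmul(P, Pi))      # truncated operator Pi P Pi
    QN = matmul(Pi, matmul(P, comp))    # Pi P (1 - Pi)
    iterates = [list(rho)]              # rho^(s) = P^s rho
    for _ in range(t):
        iterates.append(matvec(P, iterates[-1]))
    lhs = matvec(Pi, iterates[t])
    rhs = list(rho)
    for _ in range(t):                  # P_N^t rho
        rhs = matvec(PN, rhs)
    for s in range(t):                  # add P_N^s Q_N rho^(t-1-s)
        term = matvec(QN, iterates[t - 1 - s])
        for _ in range(s):
            term = matvec(PN, term)
        rhs = [r + x for r, x in zip(rhs, term)]
    return max(abs(a - b) for a, b in zip(lhs, rhs))
```

For any matrix and starting vector the returned gap is zero up to rounding, which reflects that the lemma uses only the definitions of $P_N$ and $Q_N$ and no property of the kernel.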
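Lemma 38 (Truncation bounds).

(i) For any bounded function $\eta$ the estimate

$$\| \Pi_N P \eta \|_\infty \le \| \eta \|_\infty \Big[ 1 + |M| \, C \, \frac{[e^{\gamma} \, \mathrm{diam}\, \xi]^{N+1}}{1 - e^{\gamma} \, \mathrm{diam}\, \xi} \Big]$$

holds for all $N$.

(ii) For any bounded function $\eta$ the estimate

$$\| P_N^s \eta \|_\infty \le \Big[ 1 + |M| \, C \, \frac{[e^{\gamma} \, \mathrm{diam}\, \xi]^{N+1}}{1 - e^{\gamma} \, \mathrm{diam}\, \xi} \Big]^s \| \Pi_N \eta \|_\infty$$

holds for all $s \ge 0$ and all $N$.

Proof. By definition

$$\Pi_N P \eta(x) = \int_M dy \, \eta(y) \, \Pi_N^x K_f(y, x)$$

where the superscript $x$ indicates that $\Pi_N$ acts on the $x$-variable in $K_f(y, x)$.
Therefore,

$$\| \Pi_N P \eta \|_\infty \le \| \eta \|_\infty \max_x \int dy \, |\Pi_N^x K_f(y, x)|$$
$$\le \| \eta \|_\infty \Big[ 1 + \max_x \int dy \, |(1 - \Pi_N)^x K_f(y, x)| \Big]$$
$$\le \| \eta \|_\infty \Big[ 1 + |M| \, C \, \frac{[e^{\gamma} \, \mathrm{diam}\, \xi]^{N+1}}{1 - e^{\gamma} \, \mathrm{diam}\, \xi} \Big]$$

where we used the normalization $\int dy \, K_f(y, x) = 1$ of the kernel, and the a
priori bound on the Taylor coefficients of $K_f(y, x)$ with respect to $x$.
In particular, it follows that

$$\| P_N \eta \|_\infty = \| \Pi_N P \, \Pi_N \eta \|_\infty \le \Big[ 1 + |M| \, C \, \frac{[e^{\gamma} \, \mathrm{diam}\, \xi]^{N+1}}{1 - e^{\gamma} \, \mathrm{diam}\, \xi} \Big] \| \Pi_N \eta \|_\infty$$

and therefore

$$\| P_N^s \eta \|_\infty \le \Big[ 1 + |M| \, C \, \frac{[e^{\gamma} \, \mathrm{diam}\, \xi]^{N+1}}{1 - e^{\gamma} \, \mathrm{diam}\, \xi} \Big]^s \| \Pi_N \eta \|_\infty$$

for all $s \ge 0$ by iteration, which finishes the proof. □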
Proposition 39. Let $\rho$ be an arbitrary admissible density. For all $N, t$

$$\| \pi - P_N^t \rho \|_\infty \le [1 + |M| q_N] \, e^{|M| q_N t} \, q_N t + q_N + 2 C \theta^t$$

where we set $q_N = C \, \frac{[e^{\gamma} \, \mathrm{diam}\, \xi]^{N+1}}{1 - e^{\gamma} \, \mathrm{diam}\, \xi}$.

Proof. Observe that for all $s$ the identity $P_N^s Q_N \rho = P_N^s \Pi_N P (\rho - \Pi_N \rho)$ holds by
the definition of $P_N$ and $Q_N$, and therefore

$$\| P_N^s Q_N \rho \|_\infty = \| P_N^s \Pi_N P (\rho - \Pi_N \rho) \|_\infty$$
$$\le [1 + |M| q_N]^s \, \| \Pi_N P (\rho - \Pi_N \rho) \|_\infty$$
$$\le [1 + |M| q_N]^{s+1} \, \| \rho - \Pi_N \rho \|_\infty$$

holds for all $s \ge 0$ and all $N$, by Lemma 38. Using the a priori bounds on
the density $\rho$ stated in Lemma 33 we obtain

$$\| P_N^s Q_N \rho \|_\infty \le [1 + |M| q_N]^{s+1} \, q_N$$

for all admissible densities $\rho$, and all $N$.

Combining this uniform estimate with Lemma 37

$$\| \Pi_N \rho^{(t)} - P_N^t \rho^{(0)} \|_\infty \le \sum_{s=0}^{t-1} \| P_N^s Q_N \rho^{(t-1-s)} \|_\infty \le \sum_{s=0}^{t-1} [1 + |M| q_N]^{s+1} \, q_N = \Big[ \frac{1}{|M|} + q_N \Big] \Big\{ [1 + |M| q_N]^t - 1 \Big\}$$

and therefore

$$\| \pi - P_N^t \rho \|_\infty \le \Big[ \frac{1}{|M|} + q_N \Big] \Big\{ [1 + |M| q_N]^t - 1 \Big\} + q_N + 2 C \theta^t$$

for all $N$, $t$ and any admissible density $\rho$.

Finally, the inequality $(1 + x)^t - 1 \le e^{x t} \, x t$, which holds for all $x, t \ge 0$,
implies

$$\| \pi - P_N^t \rho \|_\infty \le [1 + |M| q_N] \, e^{|M| q_N t} \, q_N t + q_N + 2 C \theta^t, \qquad q_N = C \, \frac{[e^{\gamma} \, \mathrm{diam}\, \xi]^{N+1}}{1 - e^{\gamma} \, \mathrm{diam}\, \xi},$$

which finishes the proof. □



Now we are in a position to finish the proof of Theorem 36. Fix k > 0. 
Note that the particular choices 



t 



k 



log 



1 ' 



fc + logfc V[log(|M| C) — k] log l- £ 7diamC " l0 § 



log' 



log- 



eTdiam £ ° eTdiam f 

combined with the estimate in Proposition 39 shows 



log 



l 

eTdiam £ 



$$\| \pi - P_N^t \rho \|_\infty \le [1 + |M| q_N] \, e^{|M| q_N t} \, q_N t + q_N + 2 C \theta^t$$
$$\le [1 + |M| q_N t] \, e^{|M| q_N t} \, q_N t + q_N t + 2 C \theta^t$$
$$\le C [3 + 2e] \, e^{-k} \le 8.5 \, C e^{-k}$$

so that setting $k = n + \log[8.5 \, C]$ shows that these linear functions

t{n) 



1 log[8.5C] 
n + 



logi 



N(n) 



log^ 

V[log(|M| C) - log(8.5 C) - n] 



log 



i 



■ n + 



e^diam £ 



log 



eTdiam £ 



| lQ g i^diamC -iogiog^ | 2 1og[8.5C] i 

lQ g i^dSmT b S ■ 



eTdiam f 



will suffice for $\| \pi - P_N^t \rho \|_\infty \le 2^{-n}$ for all $n$.



□ 